Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review and clean up all Box-Cox functions #3080

Merged
merged 11 commits into from Dec 15, 2013
Merged

Conversation

rgommers
Copy link
Member

No description provided.

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling fd448b9 on rgommers:boxcox into 572aaf0 on scipy:master.

Returns
-------
llf : float
Box-Cox log-likelihood of `data` given `lmbda`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should say "given lmb."

@WarrenWeckesser
Copy link
Member

This is great, Ralf! I made a few comments inline.

I think the method="all" option in boxcox_normmax is unnecessary. It is not a lot of code, but I don't think the small increase in convenience is worth it. But that's a judgment call, and I wouldn't hold up merging because of it.

I noticed a bug in boxcox: using a negative integer lambda with an integer array gives incorrect results:

In [27]: x
Out[27]: array([1, 2, 3])

In [28]: boxcox(x, -1)
Out[28]: array([-0.,  1.,  1.])

In [29]: boxcox(x, -1.0)
Out[29]: array([-0.        ,  0.5       ,  0.66666667])

In [30]: boxcox(x, -2)
Out[30]: array([-0. ,  0.5,  0.5])

In [31]: boxcox(x, -2.0)
Out[31]: array([-0.        ,  0.375     ,  0.44444444])

@WarrenWeckesser
Copy link
Member

I think the core Box-Cox function,

    y = np.where(lmbda == 0, log(x), (x**lmbda - 1) / lmbda)

should be implemented as a ufunc, and perhaps added to scipy.special. This would eliminate some of the spurious Runtime Warnings generated when a value is zero, and it could be reused in stats.power_divergence.

@WarrenWeckesser
Copy link
Member

I created a pull request that implements the core Box-Cox transformation as a ufunc in scipy.special: #3150

lmbda = lmbda*(x == x)
y = where(lmbda == 0, log(x), (x**lmbda - 1)/lmbda)
lmbda = lmbda * (x == x) # equal to ``lmbda * np.ones(x.shape)``
y = np.where(lmbda == 0, log(x), (x**lmbda - 1) / lmbda)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#3150 was merged, so this line and the previous can be replaced by y = special.boxcox(x, lmbda).

…ndling.

Add also a reference to the original paper of Box and Cox.
Also some PEP8 cleanups in related functions.
boxcox() doesn't accept 2-D input, leave it like that.
Also refactor boxcox and boxcox_normmax so MLE can be used from the latter.

Changes the `brack` keyword in boxcox_normmax; it differed from that used in
boxcox(), keeping boxcox() results unchanged is preferred here.

This finishes the Statistics Review for all Box-Cox functions:
Closes scipygh-680
Closes scipygh-681
Closes scipygh-682
Closes scipygh-683
Closes scipygh-684
@rgommers
Copy link
Member Author

Re method='all': there are more methods that can easily be implemented, I just didn't get around to it. There's an R function (can't find it so quickly) that has seven options. So I'd like to keep 'all'.

@rgommers
Copy link
Member Author

Thanks for the review Warren. Addressed all your comments in the last commit.

"""Compute parameters for a Box-Cox normality plot, optionally show it.

A Box-Cox normality plot shows graphically what the best transformation
parameter is to use in `boxcox` for obtained a distribution that is close
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"... to obtain a distribution ..." or perhaps "... for obtaining a distribution ..."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 1e1f103 on rgommers:boxcox into 881fadf on scipy:master.

Addresses review comment on scipygh-3080.

[ci skip]
@WarrenWeckesser
Copy link
Member

Did the Travis tests run? I don't see the usual Travis status bar.

@rgommers
Copy link
Member Author

Yes they did for the second to last commit. The last commit is a one line change to a docstring, so I added [ci skip] to the commit message to not trigger Travis for it (no need to waste resources for zero code changes).

@WarrenWeckesser
Copy link
Member

OK, I see that now (and I just noticed the little green check mark next to the penultimate commit).

Merging.

WarrenWeckesser added a commit that referenced this pull request Dec 15, 2013
Review and clean up all Box-Cox functions
@WarrenWeckesser WarrenWeckesser merged commit 757211a into scipy:master Dec 15, 2013
@rgommers rgommers deleted the boxcox branch December 15, 2013 19:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants